At Stanford University over the last five years, the Institute has
been developing a working computer-assisted instruction (CAI) system for
classroom use that follows two distinct approaches: tutorial and supplementary
drill and practice. The tutorial approach to CAI uses the computer
as a "teacher" to present new concepts as well as to determine subsequent 
student work with the concepts. In contrast, drill-and-practice systems 
supplement classroom instruction by improving the skills and concepts 
introduced by the classroom teacher. 

In the spring of 1965, a CAI drill-and-practice program was initiated
in an elementary school. To implement this program a computer at
Stanford was used; telephone lines connected the computer to the teletypes
located at the school. Fourth-, fifth-, and sixth-grade students received
daily drills in arithmetic (Suppes, Jerman, and Groen, 1966). Beginning
in the fall of 1965, this operation was expanded (Suppes, Jerman, and
Brian, 1968); by the fall of 1966, computer-controlled drills were given
to approximately 900 students in six different schools. During the past
academic year, 1967-68, these drills reached over 2,000 students per day.
Elementary schools in Kentucky and Mississippi, where children received
daily drills in arithmetic, were also linked to the central computer at
Stanford.

The research reported here is a small part of an investigation of the 
potential use and value of CAI drill-and-practice systems. This study, 
in particular, reports a new use for such systems. The students who 
participated were first taught the mechanics of how to use a computer- 
based teletype to solve arithmetic word problems. Following this, a series 
of word problems was presented to them. A central characteristic of these 
problems was the requirement of a quantitative answer, but the arithmetical
operations were not explicitly indicated. An example of a problem in
arithmetic providing the pupil with an opportunity to use his knowledge of 
subtraction is the following: 

Tom collected 500 seashells and placed 43 of them
in a showcase. How many shells were not placed in 
the showcase? 

We attempted to determine the factors related to problem difficulty by 
analyzing the solutions of the problem series. An example of what we 
mean by a factor related to problem difficulty is the length of the problem. 
A natural assumption is that the larger the number of words in a problem, 
the harder the problem is to solve. 
The discussion of the regression model follows Suppes, Jerman, and
Brian (1968). What is desired is an analysis of factors that lead to 
varying difficulty. We would like to attach weights to the various 
factors that may be objectively identified in each item, and then to 
use estimates of a few such weights to predict the relative difficulty of 
each of a large number of items. To this end, the aim of the present paper 
is to formulate and test some linear structural models that lead to para- 
metric predictions of relative difficulty. 

For the word problems analyzed in this paper, the central difficulty 
was to identify the factors that contributed to the complexity of the 
problem. As a matter of notation, the jth factor of problem i in the set
of problems is denoted by $X_{ij}$. The statistical parameters estimated from
the data are the weights attached to the factors. The weight assigned to
the jth factor is denoted by $a_j$. It should be emphasized that the factors
identified and used in the model presented in this paper are not abstract
constructions from the data. Rather, they are always objective factors
identifiable by the experimenter in the problems themselves, independent
of any data analysis. Which factors turn out to be important is a matter
of the estimated weights $a_j$. All the factors used in the analyses presented
here have an intuitive and direct relevance to commonsense ideas of diffi- 
culty, and their definitions are straightforward. 

Consider the analysis of the response data. Let $p_i$ be the observed
proportion of correct responses for a group of students on problem i. The
central task of a model is to predict the observed proportion $p_i$. The
natural linear regression model in terms of the factors $X_{ij}$ and the weights
$a_j$ is simply

$$p_i = \sum_j a_j X_{ij} + a_0 .$$



A difficulty with this model, however, is that probability will not neces- 
sarily be preserved as the estimated weightings and the identifiable factors 
are combined to predict new observed proportions of correct responses. 
In order to guarantee preservation of probability, that is, to insure that



predicted $p_i$'s will always lie between 0 and 1, it is natural to make the
following transformation and to define a new variable $z_i$:^2

$$z_i = \log \frac{1 - p_i}{p_i} . \tag{1}$$

We then use as the regression model

$$z_i = \sum_j a_j X_{ij} + a_0 . \tag{2}$$

It should be noted that the reason for putting $1 - p_i$ rather than $p_i$
in the numerator of equation (1) is that it is desirable to make the
variables $z_i$ increase monotonically in difficulty. For example, if the
length of a problem increases with the difficulty of the problem, it is
desirable that the model reflect this increase directly rather than
inversely.
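
The transformation and its inverse can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the handling of observed proportions of exactly 0 or 1 follows our reading of the footnoted endpoint rule, with n the number of subjects responding to the item.

```python
import math

def difficulty_logit(p, n):
    """Return z = log((1 - p) / p) for an observed proportion correct p.

    n is the number of subjects responding to the item; it is used only
    when p is exactly 0 or 1 (assumed reading of the footnoted rule).
    """
    if p == 0:
        return math.log(2 * n - 1)
    if p == 1:
        return math.log(1 / (2 * n - 1))
    return math.log((1 - p) / p)

def predicted_proportion(z):
    """Invert the transformation: p = 1 / (1 + e^z), always between 0 and 1."""
    return 1.0 / (1.0 + math.exp(z))

# A harder item (smaller p) receives a larger z.
z_easy = difficulty_logit(0.8, 27)
z_hard = difficulty_logit(0.2, 27)
print(z_easy < z_hard)
```

Because any real-valued z maps back to a proportion strictly between 0 and 1, a linear model fitted on z cannot predict an inadmissible probability, which is the point of the transformation.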

The variables we consider are of two types. The first type are
0,1-variables. Such variables would be appropriate, for example, in dealing
with a problem that requires a conversion of units. If a problem requires
a conversion of units, such as from months to weeks, the conversion variable
for that problem receives a value of 1, and 0 otherwise. The second kind
of variable is one that assumes a finite set of values, where the set has
more than two members. Such a variable would be appropriate, for example,
in dealing with the length of the problem; the length variable receives a
value equal to the number of words in the problem.

Two other variables of the second type are the operations variable and
the steps variable. The operations variable refers to the minimum number
of different operations required to reach the correct solution. For a given
problem, this variable could take on a value of 1, 2, 3, or 4. The steps
variable refers to the minimum number of steps required to reach the correct
solution.^3 These two variables may be distinguished more clearly if we
consider a problem which requires two or more computational processes before
the answer can be found. This type of problem is called a "multiple-step"
problem, and "multiple-step," as used here, refers not to the details of
the processes, but to the number of binary operations--addition, subtraction,
multiplication, or division--required to obtain the answer. A problem that
asks the student to find the average of 11 numbers would give a value of 11
to the steps variable and a value of 2 to the operations variable.
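
The distinction between the two counts can be made concrete with a minimal sketch. It assumes that a minimal solution has already been written down as a sequence of binary operations, coded with the letters A, S, M, and Q used on the teletype; the function name is hypothetical.

```python
def steps_and_operations(solution):
    """Return (steps, operations) for a minimal solution.

    solution is a sequence of binary operation codes, e.g. ["A", "A", "Q"].
    steps counts every binary operation; operations counts distinct ones.
    """
    return len(solution), len(set(solution))

# Averaging 11 numbers: ten additions followed by one division.
average_of_eleven = ["A"] * 10 + ["Q"]
print(steps_and_operations(average_of_eleven))
```

The sketch does not search for the minimal solution; it only counts one that is supplied.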

A few words must be said about the length variable. Sentence length
is frequently proposed as the most obvious and plausible variable in determining
sentence difficulty. This factor is generally determined by a total
count of the number of words in the sentence. Studies in language acquisition
(Miller and Ervin, 1965; Ervin, 1964) gave evidence of the gradual
progression of children's language development from one-word sentences,
holophrases, to two-word pivot sentences, to sentences consisting of greater
numbers of words. In imitation of adult sentences, children tend to use a
"telegraphic code," a sentence form which is a shortening of adult sentences
that retains only content words. Braun-Lamesch (1962) found that younger
children cannot recall whole sentences easily. Because this evidence indicates
that young children in early language development lack the ability
to process long sentences, it seems safe to say that long sentences are more
difficult for children to comprehend than shorter sentences. For the present,
we shall generalize these results and assume they imply that longer word
problems will be more difficult than shorter ones. In subsequent studies,
however, we hope to look at the actual syntactic structure of the sentences,
which should be a more meaningful index of difficulty than mere word count
alone.

The sequential variable is the first 0,1-variable. Post (1958) completed
a carefully designed study which investigated the effects of several factors
on problem-solving in arithmetic. The factors studied were: (a) size of
numbers; (b) superfluous numerical data; (c) number of steps; (d) familiarity
with setting; (e) type of operation; and (f) symbolic terms. Each factor
was investigated on two levels that were studied in conjunction with each
level of the remaining factors, giving sixty-four ($2^6$) treatment combinations
in all. The findings indicated that the type of operation was the most
important factor, although familiarity of setting and superfluous numerical
data were significant also. These results suggest a new factor, which we
chose to call the sequential variable. If a problem may be solved by the
same operation(s), in the same order, as the problem that preceded it,
the sequential variable for that problem is assigned the value of 1, and
0 otherwise. The possible importance of this factor in the present context
was suggested to us by Professor Leon Henkin. Successful use of it in the
analysis of fractions is found in Suppes, Jerman, and Brian (1968, Chap. 7).
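
As a sketch of how the sequential variable could be coded for a series of problems, the following assumes each problem is described by the ordered sequence of operations that solves it. The function name is hypothetical, and assigning 0 to the first problem of a series (which has no predecessor) is our assumption.

```python
def sequential_variable(solutions):
    """Return a list of 0/1 values, one per problem in presentation order.

    solutions[i] is the ordered sequence of operation codes for problem i.
    A problem scores 1 if it uses the same operations, in the same order,
    as the problem immediately before it.
    """
    values = []
    previous = None
    for ops in solutions:
        current = tuple(ops)
        values.append(1 if previous is not None and current == previous else 0)
        previous = current
    return values

series = [["S"], ["S"], ["A", "Q"], ["Q", "A"], ["Q", "A"]]
print(sequential_variable(series))
```

Note that the comparison is order-sensitive, so an addition followed by a division does not match a division followed by an addition.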

The verbal-clue variable is the second 0,1-variable. Brownell and
Stretch (1931) felt that a problem could be analyzed into several elements
or factors, one of which was a verbal clue to the operations. This factor
was not varied systematically, and so no conclusions could be drawn about
it. Brownell and Stretch did suggest that there were other factors, yet
unknown, which influence problem-solving. We are again indebted to Leon
Henkin for suggesting this variable in the present context. He chose to
define it as follows:

1. The verbal clue for problems requiring a single addition is the word
"and"; if the problem contains this word, the verbal-clue variable
for that problem is to be assigned a value of 1, and 0 otherwise.

2. The corresponding verbal clues for the other operations are:

a. "left" or a comparative for subtraction;

b. "each" for multiplication;

c. "average" or "each" appearing in the question sentence of the
problem for division.

3. Problems requiring multiple operations must contain all of the verbal
clues pertaining to the required operations in order that the verbal-clue
variable be assigned a value of 1.
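
The three rules above can be sketched as a small scoring function. Several details are assumptions made only for illustration: the list of comparatives is a short hypothetical sample, the question sentence is taken to be the last sentence of the problem, and the operations are coded A, S, M, and Q as on the teletype.

```python
import re

# Hypothetical sample of comparatives; the paper does not list them.
COMPARATIVES = {"more", "less", "fewer", "older", "younger", "taller", "longer"}

def _words(text):
    """Lower-cased set of alphabetic words in text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def verbal_clue(problem, operations):
    """Return 1 if the problem contains the clue for every required operation."""
    sentences = re.split(r"(?<=[.?!])\s+", problem.strip())
    all_words = _words(problem)
    question_words = _words(sentences[-1])  # assumed: question comes last
    has_clue = {
        "A": "and" in all_words,
        "S": "left" in all_words or bool(all_words & COMPARATIVES),
        "M": "each" in all_words,
        "Q": bool(question_words & {"average", "each"}),
    }
    return 1 if all(has_clue[op] for op in set(operations)) else 0

seashells = ("Tom collected 500 seashells and placed 43 of them in a showcase. "
             "How many shells were not placed in the showcase?")
print(verbal_clue(seashells, ["S"]))
```

A word-list test of this kind is crude; it treats any occurrence of "and" as a clue, exactly as rule 1 is worded, whether or not the word signals addition in context.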

The conversion variable is the last 0,1-variable. If a problem
requires a conversion of units, such as from months to weeks, the conversion
variable for that problem is assigned a value of 1, and 0 otherwise. The
importance of this variable was suggested by the results of an informal
pilot study described below.
In summary, the variables we investigated are:

$X_1$ = the operations variable, that is, the minimum number of different
operations required to reach the correct solution;

$X_2$ = the steps variable, that is, the minimum number of steps required
to reach the correct solution;

$X_3$ = the length variable, that is, the number of words in the problem;

$X_4$ = the sequential variable, assigned a value of 1 if the problem is
of the same type (i.e., can be solved by the same operation(s)) as
the problem that preceded it, and 0 otherwise;

$X_5$ = the verbal-clue variable, assigned a value of 1 if the problem contains
a verbal clue to the operation(s) required to solve the problem,
and 0 otherwise;

$X_6$ = the conversion variable, assigned a value of 1 if a conversion of
units is required to solve the problem, and 0 otherwise.



II. DESIGN AND EXPERIMENTAL PROCEDURE 



A. Subjects 

The 27 subjects used in this study were taken from an accelerated 
mathematics group composed of bright fifth-grade students from four different 
elementary schools near Stanford University. The children all came from 
middle-class, suburban communities. The students had received teletype 
instruction in logic and mathematics drill-and-practice, so familiarizing 
them with the machine was not a problem. 
B. Equipment



The student terminals used in this project were commercially available 
teletype machines, connected by private telephone lines to a computer at
the Institute for Mathematical Studies in the Social Sciences at Stanford. 
There were 10 teletypes, all operating in a single classroom at one of the 
elementary schools. The children from the other three schools were bussed 
to that school for one hour every day. When not operating the teletypes, 
the children in the special mathematics group received classroom instruction 
in elementary mathematics from Mr. James Newland, a teacher associated with 
the project. 

The control functions for the entire system were handled by the PDP-1, 
a medium-sized computer with a 32,000-word core and a 4,000-word core
interchangeable with any of 32 bands of a magnetic drum, together with two large
IBM-1301 disc files. All input-output devices were processed through a
time-sharing system. Two high-speed data channels permitted simultaneous 
computation and servicing of peripheral devices. 
C. Instructional Program

To initiate a lesson, a student typed "P" (for problem solving) followed 
by his assigned number and his name. When this was correctly done, the program
began. If the student made an error or gave a fictitious name, such
as "Superman," he was asked to try again.

The computer consulted the student’s file and began with the item 
following the last one completed. The items were divided into two parts, 
with the set of instructions presented before the set of problems. 

The set of instructions. Instructions were presented, via computer,
to teach the students how to command the computer to perform operations on 
given numbers. Table 1 lists and gives an example of each of the abbreviated 
operation names that the student learned in the instruction set. 



Insert Table 1 about here 
TABLE 1.



Operation Abbreviations Taught in the Instruction Set

Abbreviation   Operation        Example               Comments

X              THE ANSWER KEY   G        1)  21       The line number followed by X
                                _1X_                  indicates what line the answer
                                                      is on.

A              ADD              G        1)  36
                                G        2)  41
                                _1.2A_   3)  77

S              SUBTRACT         G        1)  500
                                G        2)  48
                                _1.2S_   3)  452

M              MULTIPLY         G        1)  59
                                G        2)  4
                                _1.2M_   3)  236

Q              DIVIDE           G        1)  77       Q rather than D was used for divide
                                G        2)  7        because D was used for something
                                _1.2Q_   3)  11       else in the system.

E              ENTER            G        1)  12       E is used to enter a number that is
                                _E_      2)  _7_      not entered by the computer program.
                                                      For example, in a problem that asks
                                                      the student to find the number of
                                                      days in 12 weeks, the student would
                                                      be required to enter the number 7,
                                                      the number of days in one week. The
                                                      number 12 would be entered by the
                                                      computer as a "given number."

Note: Student entries are underlined.
The following sequence of interactions between the student and the
computer illustrates how a problem is solved in this context. Student
entries are underlined. The computer first types out the problem, and
then types out the numbers in that problem. The student sees on the
printout sheet before him:

Tom collected 500 seashells and placed 43 of them 
in a showcase. How many shells were not placed in 
the showcase... 

G 1) 500

G 2) 43

"G" stands for "given number."^4

The student then responds by telling the computer the operation he
wants the computer to perform, and the line numbers to which the operation
should apply. In the present case, the student ordinarily types
out "1.2S," meaning "from the number shown on line 1 subtract the number
shown on line 2." The computer responds by typing the result of applying
the operation, or by typing an error message if the operation
could not be applied validly.

The student also learned to indicate the answer by typing the 
line number followed by an X. The complete protocol for a correct 
response in the above example, then, might be: 

Tom collected 500 seashells and placed 43 of them 
in a showcase. How many shells were not placed in 
the showcase... 

G 1) 500 

G 2) 43 

_1.2S_ 3) 457

Correct 
(Again, student entries are underlined.) If the answer is incorrect,
"answer is wrong" appears in place of "correct." The protocol for a
response which elicits an error message might be:

Tom collected 500 seashells and placed 43 of them in
a showcase. How many shells were not placed in the
showcase...

G 1) 500

G 2) 43

_1.2AD_ There is no rule name "AD."

There are often many ways to solve a given problem. Which rule to 
use and the details of use are matters of strategy determined by the 
experience and ingenuity of the student. The computer allows any valid 
step, regardless of whether it helps reach the solution. Any combination 
of steps reaching a solution, valid within the rules, is entirely acceptable, 
however idiosyncratic. 
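
The command protocol described above is simple enough to sketch as a toy interpreter. This is an illustration only, not the program that ran on the PDP-1: the function name and the "invalid step" message are hypothetical, the E (enter) command and hints are omitted, and only the messages "correct," "answer is wrong," and the unknown-rule message come from the protocols shown.

```python
import re

# Binary rules taught in the instruction set.
RULES = {
    "A": lambda a, b: a + b,
    "S": lambda a, b: a - b,
    "M": lambda a, b: a * b,
    "Q": lambda a, b: a / b,
}

def run_problem(givens, entries, desired):
    """Apply student entries to the given numbers; return the computer's replies."""
    lines = list(givens)          # line k holds lines[k - 1]
    transcript = []
    for entry in entries:
        answer = re.fullmatch(r"(\d+)X", entry)
        step = re.fullmatch(r"(\d+)\.(\d+)([A-Z]+)", entry)
        if answer:
            k = int(answer.group(1))
            if not 1 <= k <= len(lines):
                transcript.append("invalid step")   # hypothetical message
                continue
            transcript.append("correct" if lines[k - 1] == desired else "answer is wrong")
            break                                   # the problem ends either way
        if not step:
            transcript.append("invalid step")       # hypothetical message
            continue
        i, j, name = int(step.group(1)), int(step.group(2)), step.group(3)
        if name not in RULES:
            transcript.append(f'There is no rule name "{name}."')
            continue
        if not (1 <= i <= len(lines) and 1 <= j <= len(lines)):
            transcript.append("invalid step")       # hypothetical message
            continue
        try:
            result = RULES[name](lines[i - 1], lines[j - 1])
        except ZeroDivisionError:
            transcript.append("invalid step")       # hypothetical message
            continue
        if isinstance(result, float) and result.is_integer():
            result = int(result)
        lines.append(result)                        # result goes on a new line
        transcript.append(f"{entry} {len(lines)}) {result}")
    return transcript

print(run_problem([500, 43], ["1.2S", "3X"], 457))
```

As in the system described, the sketch accepts any valid step whether or not it helps, so a student may take a roundabout route and still be marked correct.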

In the instruction set, easier examples preceded more difficult ones. 

On several of the problems, the student was invited to ask for help after 
a certain time lapse by the message, "Type H and a space if you want a 
hint." No hints were available on multiple-choice problems; the student 
had to guess until he got the problem correct. 

The computer did four things while the student was trying to reach
a solution:

1. It examined each instruction by the student to see if the syntax
was correct and the step was valid. If the instruction was incorrect,
the computer printed out an error message.

2. It performed whatever valid step the student commanded, regardless
of whether the step contributed to the correct solution.

3. It compared the solution indicated by "X" with the desired
solution. If they were identical, it terminated the problem
after typing "correct." If the solution was incorrect, it typed
"answer is wrong."

4. On certain problems it offered a hint after a fixed time lapse.
The hints programmed were usually starting hints. If the
student had already completed steps, the hint might no longer
be appropriate. Hints were available only for certain problems 
in the instruction set, not for those in the problem set. 

At the end of the six-minute session, the student was signed off
automatically as soon as he completed an unfinished problem, or if he 
had a two-minute interval with no response. 

The word-problem set. Because these fifth-grade students were from
an accelerated group, the 68 word problems used in this study were designed 
to be of appropriate difficulty for sixth-grade students. The students 
used the rules they learned in the instruction set to solve these problems. 

As was done in the examples in the instruction set, the computer, after typing 
out each problem, typed out all the numbers given in the problem as "given 
numbers." The student then told the computer what to do with these numbers. 
Figure 1 illustrates how a student went about solving a word problem in 
this way. The type wheel of the teletype was positioned at the left-hand
side of the paper. After the student made his response, the computer
positioned the type wheel at the center of the page, typed the line number,
and the result of the operation the student had commanded the computer to
perform. If the final answer was correct, the computer typed the message
"correct" and went on to the next problem. If the final answer was incorrect,
the computer typed "answer is wrong" and went to the next problem.

Insert Figure 1 about here

The students were not allowed to use pencil or paper when working 
on the teletype. Each exercise was worked on the machine, so that all 
responses could be recorded. 

The student was signed off, as during the instruction set, with a 
"goodbye" message, and "please tear off on the dotted line." 
COMMITTEE MEMBERS BOUGHT 3 JARS OF CANDY
WITH 14 OUNCES IN EACH JAR, AND 2 BOXES OF CANDY 
WITH 27 OUNCES IN EACH BOX. THEY PUT THE CANDY 
INTO BAGS THAT CONTAINED 4 OUNCES EACH. 

HOW MANY BAGS OF CANDY DID THEY FILL? 
D. Informal Pilot Study

Prior to running the experiment, three students at the elementary 
school served as subjects in an informal pilot study. Each subject was 
asked to read each problem aloud, to point out any difficult words in 
the problem, and to indicate how to solve the problem. The student did 
not actually perform the operations. The entire interview was recorded
and later studied. The following information was extracted from the
recordings; some of it led to modifications of the program
or problems.

1. Four- and five-digit numbers did not have commas separating the
hundreds digit from the thousands digit. For all three students, this led to
difficulty in reading the five-digit, but not the four-digit,
numbers. The five-digit numbers were changed to include commas.

2. All three students had difficulty with the following problem:

"Jerry counted names listed on a page in the telephone
directory, and there were 55 pages in the book. How many
telephone subscribers were listed in his directory..."

"Directory" was changed to "book." "Subscribers" was changed to "names." 

3. The word "equatorial" in the phrase "the equatorial diameter"
was dropped; all three students found it difficult to read the word.

4. Reading difficulty resulted when a phrase was split so that half
of it occurred on one line and half on the line just below. 

"Each of the 27 children in Miss 
Brown’s room planned to bring 250 
pounds of newspaper ..." 

was changed to: 

"Each of the 27 children in Miss Brown’s room 
planned to bring 250 pounds of newspaper..." 

5. Final sentences such as "How much did they both have," or
"how much did they have together" offered real cues as to what operation
the problem calls for. However, sentences such as "What was their net 
gain in yardage" left the students without the slightest idea of what 
operation to use. This suggested the possibility that the presence or 
absence of a key word might be a powerful index of item difficulty. 
6. A problem such as the following could be solved in several

ways:

"Paul delivered 140 papers. Of these he delivered 61
on Poplar Street, 58 on Garfield Ave., and the rest on
York Road. How many did he deliver on York Road..."

It could be solved:

(140 - 61) - 58 or 140 - (61 + 58).

The interesting finding was that all three students used the latter 
approach in solving this type of problem. 

7. Two out of three students could not solve the following
problem:

"Steve has 15 toy soldiers, Tom has 18,
and Richard has 41. What is the average number
of toy soldiers..."

Their understanding of the concept of average was unsatisfactory. Some
brief instructions were included in the instruction set of the program
to teach the students how to do such problems, since most of the division
problems required understanding of averaging.
III. RESULTS



In this section, the main task is to report the predictive worth of
the six variables described earlier. The objective is to predict successfully
the probability of a correct response for each item. The first
step in analysis was to obtain regression coefficients for each of the
factors. A multiple linear regression analysis program, adapted for the
PDP-1 computer at Stanford, was used to obtain regression coefficients,
multiple correlation $R$, and $R^2$. The regression equation was

$$z_i = a_0 + .87X_{i1} + .18X_{i2} + .02X_{i3} + 2.4[?]X_{i4} + .26X_{i5} + 1.42X_{i6}$$

(* indicates significance; the value of the constant $a_0$, the final digit of
the $X_{i4}$ coefficient, and the placement of the asterisks are not legible in
this copy)

with a multiple $R$ of .67, a standard error of estimate of 1.75, and an
$R^2$ of .45. The results obtained from this model were reasonably successful,
considering the complexity of the problems.

From scanning the coefficients, we see that $X_4$, the sequential variable,
is the most important of the six variables. The other weightings indicate
that the conversion variable, $X_6$, and the operations variable, $X_1$, are
valuable predictors of the probability of a correct response for each
item. A rough indication of the goodness of fit of the regression line
is given by the multiple correlation coefficient $R$ and its square, $R^2$,
which is an estimate of the amount of variance accounted for by the
regression model. In this case, 45 percent of the variance in probability
of a correct response is accounted for by the model.

Figure 2 presents a graph of the predicted and observed proportions 
of correct responses for each of the 68 items. The probabilities are 
plotted as a function of the rank of observed proportion of correct responses. 



Insert Figure 2 about here 
Consequently, the curve of the observed probabilities is monotonically

decreasing and smoother than the predicted curve. An inspection of the
two curves shows a reasonable fit for the regression model, especially in
view of the heterogeneity of problem types. For an analysis of goodness
of fit of the probability of a correct response predicted from the regression
model and the observed probability of a correct response, a computer
program was written to calculate the predicted probability, $p_i$, of a
correct response for problem i, and to give as a measure of fit $\chi^2$, where

$$\chi^2 = \sum_i \frac{(f_i - p_i N)^2}{p_i (1 - p_i) N}$$

and $f_i$ = observed frequency of correct response, N = number of students.
For the above model, $\chi^2$ = 555.76.
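
The fit statistic is a sum of per-item terms, so it is easy to compute and to inspect item by item. The following minimal sketch uses hypothetical function names and made-up numbers rather than the study's data; it assumes every predicted probability lies strictly between 0 and 1.

```python
def chi_square_contributions(observed_correct, predicted_p, n_students):
    """Per-item terms (f_i - p_i N)^2 / (p_i (1 - p_i) N)."""
    terms = []
    for f, p in zip(observed_correct, predicted_p):
        expected = p * n_students
        terms.append((f - expected) ** 2 / (p * (1 - p) * n_students))
    return terms

def chi_square_fit(observed_correct, predicted_p, n_students):
    """Total chi-square over all items."""
    return sum(chi_square_contributions(observed_correct, predicted_p, n_students))

# Illustrative values only: three items answered by 20 students.
observed = [15, 10, 4]
predicted = [0.5, 0.5, 0.2]
terms = chi_square_contributions(observed, predicted, 20)
shares = [t / sum(terms) for t in terms]   # each item's share of the total
print(terms, shares)
```

Because the denominator shrinks as a predicted probability approaches 0 or 1, an item predicted to be very easy but answered poorly can dominate the total.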

2 

This rather high value for $\chi^2$ is an indication of a poor fit, but
a closer look at the components of $\chi^2$ shows that a few problems made
extremely large contributions to the total $\chi^2$. The following problem, for
example, contributed 26 percent to the total $\chi^2$ obtained:

"A school playground is rectangular, 275 feet long
and 21 feet wide. What is the total length of the fence
around the playground..."

The observed proportion of correct responses for this item was .59, while
the predicted proportion was .97; clearly, this is a very poor fit. As a
second example, the following problem contributed 16 percent to the total
$\chi^2$ obtained.

"Mary is twice as old as Betty was 2 years ago.
Mary is years old. How old is Betty..."

A reduction in $\chi^2$, obtained by deleting those few extreme problems,
is still insufficient to yield a value of $\chi^2$ such that the model would
normally be accepted. An analysis on a reduced set of data is suggestive,
however, and useful. This reduced set excludes seven of the problems
that have extreme individual $\chi^2$ contributions. Since calculation of the
regression coefficients included the extreme problems, a recalculation
of the regression coefficients omitting these problems from the data
yields better fits of the model to data than those previously obtained.
We emphasize that this procedure of dropping individual problems
with large $\chi^2$ values is certainly not admissible as an inference procedure.
The $\chi^2$ values reported here provide a useful descriptive statistic for
summarizing the order of magnitude of deviations between the observed and 
predicted results for the bulk of the problems, and for identifying types 
of problems, such as those two just mentioned, that require a more elaborate 
theory. 

The regression equation for the reduced set of 63 problems,

$$z_i = -7.85 + .78X_{i1}^* + .29X_{i2} + .02X_{i3} + 2.55X_{i4}^* + .27X_{i5} + 1.33X_{i6}^*$$

(* indicates significance)

has a multiple $R$ of .73, with a standard error of estimate of 1.59, and $R^2$
of .53. For this reduced set, $\chi^2$ = 168.51.

Consideration of the partial correlation coefficients indicates that
most of the variance can be accounted for by $X_1$, $X_2$, $X_4$, and $X_6$. If we
reduce the number of variables in the regression equation to include only
these, the reduction in multiple $R$ and $R^2$ is very slight. Considering
only these four variables, the regression equation (for the 63 problems)
becomes

$$z_i = -7.55 + .90X_{i1}^* + .30X_{i2} + 2.42X_{i4}^* + 1.3[?]X_{i6}^*$$

(the final digit of the $X_{i6}$ coefficient is not legible in this copy)

with a multiple $R$ of .72, a standard error of estimate of 1.58, and $R^2$
of .52. For this model, $\chi^2$ = 178.33.
IV. DISCUSSION

Although the predictive results of our first relatively crude analysis
of the data are far from what we ultimately hope to be able to offer, they
are somewhat promising. There is considerable difficulty in intuitively
rank-ordering the expected proportions of correct responses obtained in
word problems. We believe that our results give a sense of the real possibility
of analyzing and predicting, in terms of meaningful variables,
the response performance of children who are solving arithmetical word
problems. At first glance, the problem set appears to be quite complex.
Yet, with a few variables we have brought a considerable amount of order
to it. The most suggestive single finding is probably the importance of
the sequential variable in all the analyses. It is significant beyond
the .001 level, indicating that it is clearly an important variable contributing
to problem difficulty.

The relatively subtle results obtained in this first study give a
clear indication of the difficulty in building a processing model, or,
to put it another way, in constructing an explanatory theory that is
adequate to account for all the difficulties students encounter in solving
word problems. From a theoretical standpoint, it is apparent that nothing
short of a full syntactic and semantic analysis will suffice to predict
all the details that must be accounted for in the behavior of students.
Even then, it is not simply a matter of an abstract syntax and semantics
for some significant portion of English or another natural language; it
is a matter of having a behaviorally sensitive syntax and semantics.

Many mathematicians concerned with mathematics education perhaps do not 
appreciate sufficiently that until better fundamental theories are avail- 
able, certain directions of deeper progress in mathematics education are 
hardly possible. The present study was meant to be a very modest step 
in a direction of much needed additional research. 
places in the text, and also to Mr. Roulette Smith for programming
assistance.

^2 To take care of the case when the observed $p_i$ is either 0 or 1, we
use the following transformation:

$$z_i = \begin{cases} \log (2n_i - 1) & \text{for } p_i = 0 \\ \log \dfrac{1}{2n_i - 1} & \text{for } p_i = 1 , \end{cases}$$

where $n_i$ = the total number of subjects responding to item i. The
exact form of this transformation is not important.

^3 To avoid any ambiguity, we always first minimize the number of steps
and then the number of operations.

^4 The reason for designing the program in this way was to reduce the
time required for students to input large numbers themselves.